2. Assignment 2: Counter Synthesis

Now that we have worked with a Verilog hardware description and simulated it, we can synthesize the circuit to see if it will meet the specification or needs some rework.

The Cadence software used for synthesis is called Genus. Before getting started, you can look at the tool’s description and play around with TCL a little, because we will have to write TCL scripts.

The lecture part about synthesis is not complete yet, but will be for the lecture next Tuesday.

2.1. Tool Documentation System

The Cadence tools are usually very complex, so we will present here only the minimal set of commands required to produce a result.

However, the tool documentation is also a good source of knowledge to understand how the various steps are realised, and how to tune the software.

The integrated documentation also contains a reference of commands callable from a TCL script. This list is useful to find the best way to perform certain operations.

To open the documentation system:

  • Open a new terminal
  • Load the tools
  • Call the command cdnshelp

A window like the following should open:

../../_images/cdnshelp.png

You are encouraged to use the Cadence Help to read about the tool’s functionality in more detail.

2.2. Simple Synthesis

If the synthesis process has been understood correctly from the lecture, we should know how to prepare the input data it requires:

  • The Logic Standard cell library: Provided by the technology design kit.
  • The HDL design: this will be the counter implementation, and a “top level” design using the counter
  • The Clock and Input/Output constraints: write a constraints file

2.2.1. Preparing the synthesis run folder and script

To start working on synthesis, create a new folder for this assignment:

~ $ mkdir dds17/assignment2

~ $ cd dds17/assignment2

dds17/assignment2 $

You can also prepare the synthesis script right away.

Create a file called synthesize.tcl, and open it in your text editor.

2.2.2. Preparing the synthesis top level

The counter we have designed is a generic component, which means it has an undefined parameter, namely its width.

In the same way we need a testbench to simulate the design, we will need a module representing the physical entity we want to synthesise, which will contain the counter instantiated with a specific fixed width.

This module, called “top level”, or “top” for short, will for now only contain an instance of the counter and feed through the counter’s inputs and outputs.

We can choose to synthesize a 32-bit counter. Save the top module to a file called “counter_top.v”.

../../_images/counter-top.png

Warning

Make sure all the inputs/outputs of the counter are propagated to the top level, otherwise the logic optimiser will consider part or all of the design useless and remove it.

Warning

Leave the original counter file in the assignment1 folder; we will refer to this file later. Don’t copy it to the assignment2 folder.

Todo

Make sure the counter from assignment 1 is using a synchronous reset (no “posedge reset” in the always block).

2.2.3. Loading the Technology Library

To start with synthesis, we have to select the set of logic functions to be used. The technology kit provides a set of functions, with various timing characterisation options.

The library is a file named “xxxx.lib”. Our technology provides several of them here:

/var/autofs/cadence/umc-65/65nm-stdcells/synopsys

The documentation for the cells is located at:

/var/autofs/cadence/65nm-stdcells/doc/databook.pdf

Question

Look at the available files, what do “wc”, “bc”, “tc” in the file names mean? Use the databook to find the answer.

Question

Which of these files have the best or worst timing defined?

Question

Which file would you choose to perform synthesis?

Once you have selected a file, you can load it into the tool by setting the library attribute with set_db. Add the following line to synthesize.tcl:

## Select technology library
set_db library /path/to/library.lib

Now let’s start the tool and see what happens:

dds17/assignment2 $ genus -files synthesize.tcl

Hint

The tool will load the library and print a lot of information and warnings. This is perfectly normal and you can keep going.

2.2.4. Reading the Design

The next step is to load the HDL design. To do so, we need to provide a list of files:

  • The counter_top top level
  • The counter from the assignment 1

The command used to read design models is called read_hdl.

We are only using the verilog 2001 syntax, so the command can be added to the synthesis script:

## counter_top is in the current folder
## counter is taken from assignment one
## Both files passed as a list, hence the { ... } syntax
## -v2001 means "Verilog 2001" Syntax
read_hdl -v2001 {counter_top.v ../assignment1/counter.v}

If the files contain syntax errors, the tool will report them, much as the simulator does.

2.2.5. Performing Elaboration

After reading the source files, the whole design needs to be elaborated.

The elaboration step builds the hierarchy of modules and a representation of the logic using abstract representations close to the actual syntax.

To elaborate, simply call the elaborate command.

## Run elaboration
elaborate

Elaboration is generally error-free. Some errors may come up for specific hardware definition mistakes, or for example if you forgot to read some design files.

In that case the module instances whose source files are missing will be marked as black-box. Black-boxes can be fine or produce an error, depending on the settings.

At this point, you can use the check_design command, which will print a simple summary of facts for the design. This command is useful to detect missing elements, like source files:

genus@root:> check_design

Checking the design.

Check Design Report
--------------------
Name                      Total
-------------------------------------------
Unresolved References     0
Empty Modules             0
Unloaded Port(s)          0

Question

Are there any errors in the check_design report? If not, keep going.

A few of the checks are important to verify, for example:

  • Multidriven ports/pins: values that are driven by two logic blocks. This normally cannot be allowed in hardware.
  • Unresolved References or Black box: module instances which have neither a source file nor a library characterisation provided by a third party.

The black-box is usually critical, because it means the provided design is not complete.

For example, if you remove the counter source file from the script, like this:

read_hdl -v2001 {counter_top.v}

The elaboration will output a warning like the following:

Warning : Black-boxes are represented as unresolved references in the design. [TUI-273]
: Cannot resolve reference to ‘counter’. : To resolve the reference, either load a technology library containing the cell …. or read in the hdl file containing the module…

And the associated check_design will yield:

Name                      Total
-------------------------------------------
Unresolved References     1
Empty Modules             0
Unloaded Port(s)          0

2.2.6. Constraining the design

After elaboration, the tool has knowledge of the input/outputs and the design structure. At this point, the design constraints can be read.

Defining the constraints is done by calling a set of commands, using a de facto standard called SDC.

SDC commands are supported by tools from Cadence, Synopsys and the newest Xilinx Vivado Suite, which makes design constraining easy to learn and reuse among various technologies.

To start, create a file called constraints.sdc, which can be empty.

2.2.6.1. Clock Specification

Defining the clocks is the first step when writing a constraints file. Mostly three parameters are required:

  • A clock frequency or waveform (if the clock is not symmetrical)
  • A target wire to apply it to.
  • A name to identify the clock. It is usually set to the name of the wire, for clarity.

In our case, the target wire is the clock input of the design.

The clock frequency can be freely chosen. We will start with a frequency of 500 MHz.

Question

The default time unit is 1 nanosecond. What is the period of the clock for 500 MHz?

In the file constraints.sdc, you can add the following line:

## [get_port clock] is calling the command get_port which returns a pointer to an input or output named "clock"
create_clock -name clock -period PERIOD [get_port  clock]

Don’t forget to substitute PERIOD with the period you calculated.
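As a sketch of the conversion: with a time unit of 1 ns, the period is 1000 divided by the frequency in MHz. The fragment below assumes that TCL variables and expr can be used inside constraints.sdc, and uses a hypothetical 250 MHz target (not the frequency of this assignment) with illustrative variable names freq_mhz and period_ns:

```tcl
## Sketch only: freq_mhz and period_ns are illustrative names,
## 250 MHz is an example value, replace it with your own target
set freq_mhz  250.0

## Period in ns = 1000 / frequency in MHz (250 MHz gives 4.0 ns)
set period_ns [expr {1000.0 / $freq_mhz}]

create_clock -name clock -period $period_ns [get_port clock]
```

Keeping the frequency in a single variable means only one line has to change when the target frequency is tuned later.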

Todo

Enrich the constraints to support the clock_sr

2.2.6.2. Load the file in Genus

We can already load the SDC file in Genus. To do so, simply use the read_sdc command.

Add the command to the synthesize script…

read_sdc constraints.sdc

…but for now you can use it from the command line directly:

genus@root:> read_sdc constraints.sdc

The command will produce an output summarizing the SDC commands that were run and whether they succeeded:

“create_clock” - successful 1 , failed 0 (runtime 0.00)

2.2.6.3. Add Clock Uncertainty

We have introduced in the lecture the fact that the interconnection cost in timing needs to be calculated based on physical information.

However, during the first synthesis we usually don’t know what the physical layout will look like, i.e. how big the area allocated to the circuit is, and where the inputs and outputs are located.

As a consequence, the calculated timing will reflect only the cost of the logic gates. That is why we want to overconstrain the design: the time actually available to the logic cells is not the full clock period but less, because the interconnection delay will only be calculated later.

This can easily be done by using the clock uncertainty parameter. The provided value will be subtracted from the available time.

The amount of uncertainty is usually chosen by rule of thumb. For this assignment we use, as a fraction of the clock period:

  • 20% for the setup timing calculation
  • 5% for hold calculation

The commands to be added to the constraints.sdc file are:

set_clock_uncertainty SETUPTIME -setup clock
set_clock_uncertainty HOLDTIME  -hold  clock

Todo

Calculate the time required for setup and hold uncertainty. You can define TCL variables and use [expr ...] to compute the values in the script from a mathematical expression.
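One possible shape for this calculation, as a sketch: the period value of 4.0 ns and the variable names are assumptions for illustration only, while the 20% and 5% factors are the rules of thumb given above.

```tcl
## Sketch only: replace 4.0 with the clock period you calculated (in ns)
set period_ns       4.0

## Rule-of-thumb fractions of the clock period
set setup_uncert_ns [expr {$period_ns * 0.20}]   ;# 20% for setup
set hold_uncert_ns  [expr {$period_ns * 0.05}]   ;# 5% for hold

set_clock_uncertainty $setup_uncert_ns -setup clock
set_clock_uncertainty $hold_uncert_ns  -hold  clock
```

Because both values are derived from period_ns, they follow automatically when the clock period is changed.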

Todo

Read the SDC again and make sure the commands were executed successfully.

Todo

Enrich the constraints to support the clock_sr

2.2.6.4. Checking the constraints

A useful command to check the constraints is the timing linter: report timing -lint. It outputs a list of checks for timing calculation, and will report for example:

  • Input and outputs without delay, which are ignored during timing calculation
  • Logic whose clock has no constraint set
  • Multiple driver nets.

Todo

Run the report timing -lint before and after reading the SDC file (you may need to restart the tool).

Question

Which warnings disappear after applying the constraints? Which of the remaining ones should be resolved?

2.2.6.5. Input/Output Delay

As we have seen using the timing linter, the timing analysis will ignore the timing calculation of the input signals until the first flip-flop.

This behaviour is caused by the lack of an input-to-logic timing constraint. This might seem strange, but the tool really needs some data and won’t assume “0”. For a better understanding of I/O constraining, refer to the lecture Input and Output Delays.

For our design the I/O constraining has little relevance, so we could set it to 10% of the clock period for input and output, which means:

  • 90% of the clock period is available from input to logic.
  • 90% of the clock period is available from logic to output (10% is reserved for the output consumer).

The SDC commands are set_input_delay and set_output_delay, and require three parameters:

  • The name of the clock the time is related to.
  • The actual time.
  • A list of wires to apply the constraints to.

Just like for the clock definition constraint, some SDC commands are available to easily retrieve the pointers to the input and output ports.

These commands are listed in the Cadence Help under Genus Command Reference -> SDC Commands, but we will present some of them here.

Question

Which inputs and outputs should you set a delay on?

Question

What about the reset input?

Once we have a precise idea of which inputs/outputs should be constrained with which delay, we can enrich the constraints.sdc file:

set_input_delay -clock clock DELAY  {LIST OF INPUTS RELATIVE TO CLOCK}

set_output_delay -clock clock DELAY {LIST OF OUTPUTS RELATIVE TO CLOCK}

The lists of inputs and outputs can be built with the following SDC commands:

  • all_inputs/all_outputs: Self explanatory, retrieves all the inputs or outputs
  • remove_from_collection : produces a list based on the first argument, minus the elements from the second argument
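As a sketch of how these commands can be combined: the fragment below applies a delay of 10% of the clock period to every input except the clock port, and to every output. The period value and the variable names are assumptions for illustration, and only the clock port is excluded here; whether other ports such as reset belong in the list is the subject of the questions above.

```tcl
## Sketch only: replace 4.0 with your own clock period (in ns)
set period_ns   4.0
set io_delay_ns [expr {$period_ns * 0.10}]   ;# 10% of the clock period

## All inputs, minus the clock port itself
set data_inputs [remove_from_collection [all_inputs] [get_port clock]]

set_input_delay  -clock clock $io_delay_ns $data_inputs
set_output_delay -clock clock $io_delay_ns [all_outputs]
```

Building the list by subtraction avoids naming every data port by hand, so the constraints keep working if ports are added to the top level.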

Todo

Modify the presented constraints to support the input and outputs relative to clock_sr and clock correctly

Todo

Reload the constraints file and make sure no errors appear.

Todo

Rerun report timing -lint, you should notice some changes.

2.2.6.6. Output Load

The last constraint to be applied for now is the output load. Indeed, the outputs of our design might be connected to another design part on the chip, or to the external world.

In either case, the outputs will be driving a certain capacitive load, whose value impacts the speed of the output, hence the timing calculation.

The command set_load sets the output load:

set_load -pin_load 0.2 [all_outputs]

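In this case, we define all the outputs to be driving a load of 20pF, which would be roughly the capacitance when connecting the output to an output pad. The numeric value is interpreted in the capacitance unit used by the tool and library, so check that unit when relating the value to a physical capacitance.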

Todo

Reload the constraints file and make sure no errors appear.

Todo

Rerun report timing -lint, you should notice some changes.

2.2.7. Creating Timing Groups

When performing timing analysis, the tool will by default make a list of all the timing results.

However, we have already treated various types of connections differently:

  • The clock speed mainly defines timing for the pipeline stages (from one register to another)
  • The input and output delays and load are more related to the input/outputs paths to and from the registers.

Timing groups are a feature of the tool to group the logic paths depending on their types and help make timing analysis clearer.

Timing groups can also be used to help with optimisation on certain difficult logic groups, but this is usually not necessary, or only for very complex designs.

In our case, we could just create the default standard groups, which are:

  • Input to register
  • Register to output
  • Input/Output (logic which doesn’t go through a register)
  • Register to Register (internal pipeline stages).

To do so, let’s add the following lines to the synthesize.tcl script:

set all_regs [all des seqs -clock clock]
define_cost_group -name C2C
path_group -from $all_regs -to $all_regs -group C2C -name C2C

set inputs [all des inps -clock clock]
define_cost_group -name I2C
path_group -from $inputs -to $all_regs -group I2C -name I2C

set outputs [all des outs -clock clock]
define_cost_group -name C2O
path_group -from $all_regs -to $outputs -group C2O -name C2O

define_cost_group -name I2O
path_group -from $inputs -to $outputs -group I2O -name I2O

Todo

Modify the presented constraints to support clock_sr and clock correctly

2.2.8. Performing synthesis

Once the design is loaded and constrained, synthesis can be performed.

This step is fully automated, but the effort the tool should put into optimisation and the number of optimisation tries can be customized.

For this design, we will just set the effort to medium and perform one incremental optimisation pass. Once the logic design is stable and seems not too complex, the effort can be set to high to make sure the tool returns the best possible result.

Add the following lines to synthesize.tcl:

synthesize -to_mapped -effort medium
synthesize -to_mapped -incr -effort medium

Todo

Run the script now up to synthesis.
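For orientation, the steps of this section can be assembled into one script roughly as follows. This is only a sketch that reuses the commands introduced above in the order they were presented; the library path is the same placeholder as before and must be replaced with the file you selected.

```tcl
## synthesize.tcl, assembled from the steps of this section (sketch)

## Select technology library (placeholder path)
set_db library /path/to/library.lib

## Read the design files (Verilog 2001 syntax), then elaborate
read_hdl -v2001 {counter_top.v ../assignment1/counter.v}
elaborate

## Summary of missing or suspicious elements
check_design

## Clock, uncertainty, I/O delay and load constraints
read_sdc constraints.sdc

## Timing lint after constraining
report timing -lint

## (The cost groups from "Creating Timing Groups" go here)

## Map to the technology library, then one incremental pass
synthesize -to_mapped -effort medium
synthesize -to_mapped -incr -effort medium
```

The script is started as before with genus -files synthesize.tcl.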

2.2.9. Quality and Timing report

You may have seen some results on the console output, but a lot of text is generated, so using the report mechanism is the best way to find issues.

The first report you can look at is the Quality Of Result or qor report:

genus@root:> report qor

Todo

Identify facts about the design like Area, number of cells, timing slack for the various path groups.

Question

Is the timing specification met?

You can also save the qor report using simple IO redirection:

genus@root:> report qor > qor.txt

Save the report to a file to be able to look back at these results:

genus@root:> report qor > qor_first.txt

The second useful command is the report timing in non-linting mode. It will return and print the worst path in the design:

genus@root:> report timing

You can look at the options of this command to print more paths or specific paths based on a search pattern for example:

genus@root:> report timing -h

The timing report output looks like the following picture:

../../_images/timing.PNG

Question

Is it a setup or hold check?

Question

Can you understand the list of delays? To which logic gates do they belong?

Question

Can you find the clock uncertainty constraint?

Question

Do you think a Hold check is needed here? Why?

2.2.10. Saving the output

After synthesis, we expect a list of interconnected logic gates as a new representation of the logic.

This list can simply be written back to a Verilog file, which will contain instances of the logic cells instead of an abstract syntax.

genus@root:> mkdir netlist

genus@root:> write_hdl > netlist/counter_top.gtl.v

Todo

Open the netlist/counter_top.gtl.v file to see the result of synthesis.

2.3. Looking at the design using the GUI

From the command line, you can open a GUI window with which you can look at the schematic:

genus@root:> gui_raise

2.4. Performance evaluation

By tuning the target clock frequency up, you should at some point get negative timing slacks for the design.

An interesting point here is that the timing of the shift register implementation should be easier to meet than the half-adder one.

Todo

Modify the constraints to increase the clock frequency.

Todo

Run the synthesis until some negative slack appears.

Question

Can you see the Shift register counter still meeting timing while the half-adder fails?

Tip

You can reduce the counter width to make the run faster.

2.5. Post Synthesis Simulation

After a technology translation has been performed, it is a good idea to simulate the design again, in case the process was wrongly configured and the netlist produced doesn’t offer the same functionality as before.

Doing so is quite easy, because the main design and the synthesised one usually have the same set of input/outputs.

You can:

  • copy the testbench from assignment 1 to assignment 2
  • change the counter instance from “counter” to “counter_top”
  • Remove the size parameter because the physical design has a fixed size
  • Run the simulator including the verilog models of the standard cells

then run irun, passing the testbench and synthesized files:

dds17/assignment2/ $ irun -timescale 1ns/1ps -access +rw -gui counter_tb.v /var/autofs/cadence/umc-65/65nm-stdcells/verilog/uk65lscllmvbbr.v netlist/counter_top.gtl.v

Question

Is the design still functional?

2.6. Reset Type

Open the schematic GUI in Genus, and follow the reset signal.

Question

The reset should be synchronous for now. How is it connected to the logic?

Try to change the reset in the design by adding posedge reset inside the always@(…) definition.

This should make the reset now asynchronous.

Now that the reset is asynchronous, it should not be timed anymore. This can be achieved by declaring the reset input as a “false path”. Add this line to the constraints:

set_false_path -through [get_port reset]

Todo

Remove the reset port from input delay constraints. Don’t forget the reset_sr reset.
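A possible way to do this, as a sketch: the reset port is subtracted from the input list in the same way as the clock port. The period value and the variable names are assumptions for illustration, and the clock_sr and reset_sr ports are not handled here.

```tcl
## Sketch only: replace 4.0 with your own clock period (in ns)
set period_ns   4.0
set io_delay_ns [expr {$period_ns * 0.10}]

## Inputs without the clock and without the now-asynchronous reset
set timed_inputs [remove_from_collection [all_inputs] [get_port clock]]
set timed_inputs [remove_from_collection $timed_inputs [get_port reset]]

set_input_delay -clock clock $io_delay_ns $timed_inputs

## The reset is no longer timed
set_false_path -through [get_port reset]
```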

Todo

Re-Run the synthesis

Depending on how you wrote the Verilog description, the tool might fail during elaboration.

Question

Is elaboration successful? If not, can you understand why, and how to fix the design?

Tip

If not successful, remember that the reset will be asynchronous, while the other signals remain synchronous. Are those signals compatible with each other to create logic?

Once successful, open the GUI and follow the reset signal.

Question

How is it different now?

Save the QOR report:

genus@root:> report qor > qor_async.txt

Question

What are the differences in terms of gate count, area and timing?

2.7. Clock gating optimisation

Open the schematic GUI in Genus, and follow the enable or enable_sr signal.

Modify the synthesis script to enable clock gating optimisation before elaboration is performed:

## Enable clock gating
set_db lp_insert_clock_gating true

Re-run the synthesis. The enable conditions, such as the enable signals, should have been moved to logic gates which enable and disable the clock.

Todo

Follow the enable signal in the GUI, can you find the clock gating logic?

Now simulate the design again, and try to see the clock being gated in the waveform:

  • Right click in the design browser on the counter, and select Send to schematic tracer
  • In the schematic, look for a counter Flip-Flop, select its clock pin, then with a right click, select the Send to -> Waveform menu.
  • Re-Run the simulation from the waveform window (Reset and Play buttons), the Flip-Flop clock pin should toggle like the clock, but be still during a disable phase.